Checker Design Doc 1: Kick-Off

Pain points in Meridian architecture

  1. Manually releasing rewards is tedious, and it will become even more so once we have multiple subnets.
  1. All measurements need to fit into spark-evaluate process memory
  1. Measurements are serialized as JSON, which is wasteful. A denser format like CSV could save a lot of bandwidth and storage space, improve download speeds, and reduce our Storacha bill.
  1. We are not committing the evaluated measurements on the chain. Therefore, most data aggregation must happen inside spark-evaluate. As a result, we have two GH repositories sharing the same database: spark-evaluate writes the data and manages the DB schema, while spark-stats reads the data.
  1. The process for recording & publishing measurements uses Postgres as a persistent queue. We write 200k measurements every 20 minutes and delete them soon after they are recorded. This creates a lot of WAL (write-ahead log) entries; we are seeing warnings about writing too much to the WAL, and Miroslav also suspects this is the reason why a failover from the primary server to the replica takes many minutes to complete.
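
To illustrate the JSON-vs-CSV overhead from point 3, here is a minimal sketch. The field names are illustrative, not the actual Spark measurement schema; the point is that JSON repeats every key in every record, while CSV stores the keys once in a shared header row:

```javascript
// Hypothetical measurement record (field names are assumptions for
// illustration, not the real spark-evaluate schema).
const measurement = {
  participantAddress: '0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266',
  retrievalResult: 'OK',
  timeToFirstByteMs: 312,
};

// JSON: keys and quoting are repeated in every record.
const asJson = JSON.stringify(measurement);

// CSV: keys appear once in a header row shared by all records;
// each record carries only the values.
const header = Object.keys(measurement).join(',');
const asCsvRow = Object.values(measurement).join(',');

console.log(asJson.length, 'bytes per record as JSON');
console.log(asCsvRow.length, 'bytes per record as CSV (plus one shared header)');
```

At 200k records per round, the per-record savings multiply into a meaningful reduction in upload bandwidth and stored bytes.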
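
A back-of-envelope estimate of the WAL churn from point 5. The row count and round length come from the doc; the average serialized row size is an assumption for illustration:

```javascript
// WAL volume estimate for the insert-then-delete queue pattern.
const rowsPerRound = 200_000;        // from the doc: 200k measurements per round
const bytesPerRow = 1_000;           // ASSUMED average serialized row size
const roundsPerDay = (24 * 60) / 20; // one round every 20 minutes => 72/day

// Every row is written once (INSERT) and removed soon after (DELETE).
// Both operations are logged in the WAL, so the queue pushes roughly
// twice the raw data volume through the log, before counting index updates.
const walBytesPerDay = rowsPerRound * bytesPerRow * roundsPerDay * 2;

console.log((walBytesPerDay / 1e9).toFixed(1) + ' GB of WAL per day');
```

Even at a modest assumed row size, the pattern generates tens of GB of WAL daily for data that is deleted almost immediately, which is consistent with the warnings and the slow failover we are observing.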